Managing ETL Processes

Authors

  • Alexander Albrecht
  • Felix Naumann
Abstract

ETL tools allow the definition of sometimes complex processes to extract, transform, and load heterogeneous data into a data warehouse or to perform other data migration tasks. In larger organizations, many ETL processes from different data integration projects accumulate. Such processes can encompass common sub-processes, shared data sources and targets, and identical or similar operations. However, there is no common method or approach to systematically manage such ETL processes. We propose the high-level management of such processes as a generic approach to enable their flexible re-use, optimization, and rapid development. To this end, we introduce a set of basic operators on ETL processes, such as merge or invert, and motivate their use in several scenarios.


Similar resources

METL: Managing and Integrating ETL Processes

Companies use Extract-Transform-Load (ETL) tools to save time and costs when developing and maintaining data migration tasks. ETL tools allow the definition of often complex processes to extract, transform, and load heterogeneous data into a data warehouse or to perform other data migration tasks. In larger organizations many ETL processes of different data integration and warehouse projects ac...


Modeling and managing ETL processes

Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. The design, development and deployment of ETL processes, which are currently performed in an ad hoc, in-house fashion, need modeling, design and methodological foundations. Unfortunately, the resear...


An Integrated Conceptual Model for Temporal Data Warehouse Security

In the past few years, several conceptual approaches have been proposed for the specification of the main multidimensional (MD) properties of the data warehouse (DW) repository. However, most of them deal with isolated aspects of the DW and do not provide designers with an integrated and standard method for designing the whole DW life cycle (ETL processes, data sources, DW repository and so on)...


Quarry: Digging Up the Gems of Your Data Treasury

The design lifecycle of a data warehousing (DW) system is primarily led by requirements of its end-users and the complexity of underlying data sources. The process of designing a multidimensional (MD) schema and back-end extract-transform-load (ETL) processes is a long-term and mostly manual task. As enterprises shift to more real-time and 'on-the-fly' decision making, business intelligence (BI)...


Improve Performance of Extract, Transform and Load (ETL) in Data Warehouse

Extract, transform and load (ETL) is the core process of data integration and is typically associated with data warehousing. ETL tools extract data from a chosen source, transform it into new formats according to business rules, and then load it into a target data structure. Managing rules and processes for the increasing diversity of data sources and high volumes of data processed that ETL must ...




Publication date: 2008